AITopics | Blue Earth County

Copula Based Fusion of Clinical and Genomic Machine Learning Risk Scores for Breast Cancer Risk Stratification

Aich, Agnideep, Hewage, Sameera, Murshed, Md Monzur

arXiv.org Machine Learning, Nov-25-2025

Clinical and genomic models are both used to predict breast cancer outcomes, but they are often combined using simple linear rules that do not account for how their risk scores relate, especially at the extremes. Using the METABRIC breast cancer cohort, we studied whether directly modeling the joint relationship between clinical and genomic machine learning risk scores could improve risk stratification for 5-year cancer-specific mortality. We created a binary 5-year cancer-death outcome and defined two sets of predictors: a clinical set (demographic, tumor, and treatment variables) and a genomic set (gene-expression $z$-scores). We trained several supervised classifiers, such as Random Forest and XGBoost, and used 5-fold cross-validated predicted probabilities as unbiased risk scores. These scores were converted to pseudo-observations on $(0,1)^2$ to fit Gaussian, Clayton, and Gumbel copulas. Clinical models showed good discrimination (AUC 0.783), while genomic models had moderate performance (AUC 0.681). The joint distribution was best captured by a Gaussian copula (bootstrap $p=0.997$), which suggests a symmetric, moderately strong positive relationship. When we grouped patients based on this relationship, Kaplan-Meier curves showed clear differences: patients who were high-risk in both clinical and genomic scores had much poorer survival than those high-risk in only one set. These results show that copula-based fusion works in real-world cohorts and that considering dependencies between scores can better identify patient subgroups with the worst prognosis.
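The pseudo-observation and copula-fitting steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the copulas were fitted, so the sketch swaps in the standard Kendall's-tau inversion (rho = sin(pi*tau/2) for the Gaussian copula, theta = 2*tau/(1 - tau) for Clayton, theta = 1/(1 - tau) for Gumbel). The function names and the toy scores are hypothetical.

```python
import math


def pseudo_observations(scores):
    """Rank-transform scores into (0, 1) as rank / (n + 1); ties get average ranks."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [r / (n + 1) for r in ranks]


def kendall_tau(u, v):
    """Kendall's tau-a by O(n^2) pair counting (adequate for a sketch)."""
    n = len(u)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (u[i] - u[j]) * (v[i] - v[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def copula_params_from_tau(tau):
    """Invert Kendall's tau into one parameter per copula family.

    Assumption: positive, non-perfect dependence (0 < tau < 1), which the
    Clayton and Gumbel formulas below require.
    """
    if not 0 < tau < 1:
        raise ValueError("this sketch only handles 0 < tau < 1")
    return {
        "gaussian_rho": math.sin(math.pi * tau / 2),
        "clayton_theta": 2 * tau / (1 - tau),
        "gumbel_theta": 1 / (1 - tau),
    }


if __name__ == "__main__":
    # Hypothetical cross-validated risk scores for five patients.
    clinical = [0.10, 0.40, 0.35, 0.80, 0.70]
    genomic = [0.20, 0.30, 0.50, 0.90, 0.60]
    u = pseudo_observations(clinical)
    v = pseudo_observations(genomic)
    print(copula_params_from_tau(kendall_tau(u, v)))
```

Tau inversion only produces candidate parameters; choosing among the families, as the abstract reports with a bootstrap test, would need a separate goodness-of-fit step that is not shown here.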

copula, dependence, risk score, (15 more...)

arXiv.org Machine Learning

2511.17605

Country:

North America > United States > New York (0.04)
North America > United States > Minnesota > Blue Earth County > Mankato (0.04)
North America > United States > Louisiana > Lafayette Parish > Lafayette (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Improving Online Rent-or-Buy Algorithms with Sequential Decision Making and ML Predictions

Neural Information Processing Systems, Nov-15-2025, 15:21:31 GMT

In this work, we study online rent-or-buy problems as sequential decision-making problems.
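For readers unfamiliar with the rent-or-buy setting, the classical deterministic break-even baseline can be sketched as below. This is textbook background, not the algorithm proposed in the paper: keep renting while the total rent paid stays below the purchase price, then buy. The function names are hypothetical, and integer prices are assumed.

```python
def break_even_cost(buy_price, rent_price, num_days):
    """Total cost of the break-even strategy when the season lasts num_days.

    The strategy rents on days 1 .. k-1 and buys on day k, where
    k = ceil(buy_price / rent_price). Integer prices are assumed.
    """
    k = -(-buy_price // rent_price)  # ceiling division for positive ints
    if num_days < k:
        return num_days * rent_price  # season ended before we bought
    return (k - 1) * rent_price + buy_price


def offline_optimal_cost(buy_price, rent_price, num_days):
    """Cost of the best decision made with full knowledge of num_days."""
    return min(buy_price, num_days * rent_price)


if __name__ == "__main__":
    for days in (5, 10, 40):
        online = break_even_cost(10, 1, days)
        best = offline_optimal_cost(10, 1, days)
        print(days, online, best)
```

The break-even rule never pays more than twice the offline optimum, which is the guarantee that prediction-augmented algorithms typically aim to beat when the predictions are accurate.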

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Blue Earth County > Mankato (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.61)

Improving Online Rent-or-Buy Algorithms with Sequential Decision Making and ML Predictions

Neural Information Processing Systems, Aug-17-2025, 05:49:10 GMT

In this work, we study online rent-or-buy problems as sequential decision-making problems.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Blue Earth County > Mankato (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.61)

Bag of Coins: A Statistical Probe into Neural Confidence Structures

Aich, Agnideep, Aich, Ashit Baran, Murshed, Md Monzur, Hewage, Sameera, Wade, Bruce

arXiv.org Machine Learning, Jul-29-2025

Modern neural networks, despite their high accuracy, often produce poorly calibrated confidence scores, limiting their reliability in high-stakes applications. Existing calibration methods typically post-process model outputs without interrogating the internal consistency of the predictions themselves. In this work, we introduce a novel, non-parametric statistical probe, the Bag-of-Coins (BoC) test, that examines the internal consistency of a classifier's logits. The BoC test reframes confidence estimation as a frequentist hypothesis test: does the model's top-ranked class win 1-v-1 contests against random competitors at a rate consistent with its own stated softmax probability? When applied to modern deep learning architectures, this simple probe reveals a fundamental dichotomy. On Vision Transformers (ViTs), the BoC output serves as a state-of-the-art confidence score, achieving near-perfect calibration with an ECE of 0.0212, an 88% improvement over a temperature-scaled baseline. Conversely, on Convolutional Neural Networks (CNNs) like ResNet, the probe reveals a deep inconsistency between the model's predictions and its internal logit structure, a property missed by traditional metrics. We posit that BoC is not merely a calibration method, but a new diagnostic tool for understanding and exposing the differing ways that popular architectures represent uncertainty.
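The ECE figure quoted above refers to expected calibration error, a standard metric. The sketch below shows the usual equal-width binning estimate; it does not implement the BoC test itself, whose details the abstract does not give. The function name, the bin count, and the toy inputs are assumptions.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: booleans, True where the prediction matched the label.
    Bins are [k/num_bins, (k+1)/num_bins); a confidence of exactly 1.0 goes
    into the last bin.
    """
    n = len(confidences)
    conf_sum = [0.0] * num_bins
    hit_sum = [0.0] * num_bins
    count = [0] * num_bins
    for c, ok in zip(confidences, correct):
        b = min(int(c * num_bins), num_bins - 1)
        conf_sum[b] += c
        hit_sum[b] += 1.0 if ok else 0.0
        count[b] += 1
    ece = 0.0
    for b in range(num_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum[b] / count[b]
        accuracy = hit_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    # Hypothetical overconfident model: 95% stated confidence, 50% accuracy.
    print(expected_calibration_error([0.95] * 4, [True, True, False, False]))
```

A perfectly calibrated model scores 0; larger values mean stated confidence and observed accuracy diverge.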

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Machine Learning

2507.19774

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Minnesota > Blue Earth County > Mankato (0.04)
North America > United States > Louisiana > Lafayette Parish > Lafayette (0.04)
Asia > India > West Bengal > Kolkata (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CopulaSMOTE: A Copula-Based Oversampling Approach for Imbalanced Classification in Diabetes Prediction

Aich, Agnideep, Murshed, Md Monzur, Hewage, Sameera, Mayeaux, Amanda

arXiv.org Machine Learning, Jun-24-2025

Diabetes mellitus poses a significant health risk, as nearly 1 in 9 people are affected by it. Early detection can significantly lower this risk. Despite significant advancements in machine learning for identifying diabetic cases, results can still be influenced by the imbalanced nature of the data. To address this challenge, our study considered copula-based data augmentation, which preserves the dependency structure when generating data for the minority class, and integrated it with machine learning (ML) techniques. We selected the Pima Indian dataset and generated data using the A2 copula, then applied four machine learning algorithms: logistic regression, random forest, gradient boosting, and extreme gradient boosting. Our findings indicate that XGBoost combined with A2 copula oversampling achieved the best performance, improving accuracy by 4.6%, precision by 15.6%, recall by 20.4%, F1-score by 18.2%, and AUC by 25.5% compared to the standard SMOTE method. Furthermore, we statistically validated our results using the McNemar test. This research represents the first known use of A2 copulas for data augmentation and serves as an alternative to the SMOTE technique, highlighting the efficacy of copulas as a statistical method in machine learning applications.
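The McNemar test mentioned above compares two classifiers on the same test set using only the cases where they disagree. The following is a minimal sketch of the chi-square form with continuity correction; whether the authors used this variant or an exact binomial version is not stated, and the function name and toy data are hypothetical.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """McNemar chi-square test with continuity correction.

    b = cases model A gets right and model B gets wrong,
    c = cases model A gets wrong and model B gets right.
    Returns (statistic, p_value) using a chi-square law with 1 degree of freedom.
    """
    b = sum(1 for t, pa, pb in zip(y_true, pred_a, pred_b) if pa == t and pb != t)
    c = sum(1 for t, pa, pb in zip(y_true, pred_a, pred_b) if pa != t and pb == t)
    if b + c == 0:
        return 0.0, 1.0  # the models never disagree
    # The corrected numerator is clamped at zero so that b == c gives 0.
    statistic = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of chi-square with 1 dof: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical labels and predictions: A wins 10 disagreements, B wins 2.
    y_true = [1] * 20
    pred_a = [1] * 10 + [0] * 2 + [1] * 8
    pred_b = [0] * 10 + [1] * 2 + [1] * 8
    print(mcnemar_test(y_true, pred_a, pred_b))
```

With few disagreements (roughly b + c below 25), an exact binomial version is usually preferred over this chi-square approximation.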

artificial intelligence, copula, machine learning, (15 more...)

arXiv.org Machine Learning

2506.17326

Country:

North America > United States > Louisiana > Lafayette Parish > Lafayette (0.04)
North America > United States > West Virginia (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

A2 Copula-Driven Spatial Bayesian Neural Network For Modeling Non-Gaussian Dependence: A Simulation Study

Aich, Agnideep, Hewage, Sameera, Murshed, Md Monzur, Aich, Ashit Baran, Mayeaux, Amanda, Dey, Asim K., Das, Kumer P., Wade, Bruce

arXiv.org Machine Learning, Jun-2-2025

In this paper, we introduce the A2 Copula Spatial Bayesian Neural Network (A2-SBNN), a predictive spatial model designed to map coordinates to continuous fields while capturing both typical spatial patterns and extreme dependencies. By embedding the novel dual-tail Archimedean copula, A2, directly into the network's weight initialization, A2-SBNN naturally models complex spatial relationships, including rare co-movements in the data. The model is trained through a calibration-driven process combining Wasserstein loss, moment matching, and correlation penalties to refine predictions and manage uncertainty. Simulation results show that A2-SBNN consistently delivers high accuracy across a wide range of dependency strengths, offering a new, effective solution for spatial data modeling beyond traditional Gaussian-based approaches.

artificial intelligence, dependency, machine learning, (12 more...)

arXiv.org Machine Learning

2505.24006

Country:

North America > United States > Montana > Roosevelt County (0.08)
North America > United States > Louisiana > Lafayette Parish > Lafayette (0.05)
North America > United States > West Virginia (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Can Copulas Be Used for Feature Selection? A Machine Learning Study on Diabetes Risk Prediction

Aich, Agnideep, Murshed, Md Monzur, Mayeaux, Amanda, Hewage, Sameera

arXiv.org Machine Learning, May-29-2025

Accurate diabetes risk prediction relies on identifying key features from complex health datasets, but conventional methods like mutual information (MI) filters and genetic algorithms (GAs) often overlook extreme dependencies critical for high-risk subpopulations. In this study, we introduce a feature-selection framework using the upper-tail dependence coefficient (λU) of the novel A2 copula, which quantifies how often extremely high values of a predictor co-occur with diabetes diagnoses (the target variable). Applied to the CDC Diabetes Health Indicators dataset (n=253,680), our method prioritizes five predictors (self-reported general health, high blood pressure, body mass index, mobility limitations, and high cholesterol levels) based on upper-tail dependencies. These features match or outperform MI- and GA-selected subsets across four classifiers (Random Forest, XGBoost, Logistic Regression, Gradient Boosting), achieving accuracy up to 86.5% (XGBoost) and AUC up to 0.806 (Gradient Boosting), rivaling the full 21-feature model. Permutation importance confirms clinical relevance, with BMI and general health driving accuracy. To our knowledge, this is the first work to apply a copula's upper-tail dependence for supervised feature selection, bridging extreme-value theory and machine learning to deliver a practical toolkit for diabetes prevention.
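The idea of ranking predictors by upper-tail dependence with the target can be sketched with a nonparametric estimate. This is a swapped-in illustration, not the paper's method: the paper derives λU from a fitted A2 copula, whereas the sketch below simply measures, at a finite threshold q, how often the target is in its own upper tail given that the feature is. All names and the default threshold are assumptions, and ties (for example, a binary target) are broken by input order, which is crude.

```python
def pseudo_obs(values):
    """Ordinal ranks scaled to (0, 1); ties are broken by input order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


def upper_tail_dependence(x, y, q=0.9):
    """Empirical P(V > q | U > q) on the rank-transformed pair (U, V)."""
    u, v = pseudo_obs(x), pseudo_obs(y)
    in_tail_x = [a > q for a in u]
    n_tail = sum(in_tail_x)
    if n_tail == 0:
        return 0.0  # threshold too high for this sample size
    joint = sum(1 for flag, b in zip(in_tail_x, v) if flag and b > q)
    return joint / n_tail


def rank_features(features, target, q=0.9, top_k=5):
    """Return the top_k feature names by empirical upper-tail dependence."""
    scores = {name: upper_tail_dependence(col, target, q)
              for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


if __name__ == "__main__":
    # Hypothetical data: "a" moves with the target, "b" moves against it.
    target = list(range(1, 21))
    features = {"a": list(range(1, 21)), "b": list(range(20, 0, -1))}
    print(rank_features(features, target, q=0.8, top_k=1))
```

The estimate depends on the choice of q; a parametric copula fit, as in the paper, replaces that choice with a model assumption.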

artificial intelligence, copula, machine learning, (11 more...)

arXiv.org Machine Learning

2505.22554

Country:

North America > United States > Louisiana > Lafayette Parish > Lafayette (0.04)
North America > United States > West Virginia (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Analysis of Learning-based Offshore Wind Power Prediction Models with Various Feature Combinations

Fang, Linhan, Jiang, Fan, Toms, Ann Mary, Li, Xingpeng

arXiv.org Artificial Intelligence, Mar-10-2025

Accurate wind speed prediction is crucial for designing and selecting sites for offshore wind farms. This paper investigates the effectiveness of various machine learning models in predicting offshore wind power for a site near the Gulf of Mexico by analyzing meteorological data. After collecting and preprocessing meteorological data, nine different input feature combinations were designed to assess their impact on wind power predictions at multiple heights. The results show that using wind speed as the output feature improves prediction accuracy by approximately 10% compared to using wind power as the output. In addition, multi-feature inputs yield no clear improvement over single-feature inputs, mainly because of the poor correlation among key features and the limited generalization ability of the models. These findings underscore the importance of selecting appropriate output features and highlight considerations for using machine learning in wind power forecasting, offering insights that could guide future wind power prediction models and conversion techniques.
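When wind speed is the model output, a separate conversion step turns the prediction into power. The sketch below uses the textbook relation P = 0.5 * rho * A * Cp * v^3 with cut-in, cut-out, and rated-power limits. The abstract does not describe the paper's conversion method, and every turbine parameter here is a hypothetical placeholder.

```python
import math


def wind_power_kw(speed_ms, rotor_diameter_m=120.0, power_coeff=0.4,
                  air_density=1.225, cut_in_ms=3.0, cut_out_ms=25.0,
                  rated_kw=5000.0):
    """Convert wind speed (m/s) to turbine output (kW) with an idealized curve.

    All default parameters are illustrative, not taken from the paper.
    """
    if speed_ms < cut_in_ms or speed_ms > cut_out_ms:
        return 0.0  # turbine is idle below cut-in and shut down above cut-out
    swept_area = math.pi * (rotor_diameter_m / 2) ** 2
    power_w = 0.5 * air_density * swept_area * power_coeff * speed_ms ** 3
    return min(power_w / 1000.0, rated_kw)  # cap at rated power


if __name__ == "__main__":
    for v in (2.0, 8.0, 8.8, 15.0, 30.0):
        print(v, round(wind_power_kw(v), 1))
```

Because power scales with the cube of speed below rated output, a 10% speed error becomes roughly a 33% power error, which is one reason the choice of output variable can matter.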

prediction, wind power, wind speed, (11 more...)

arXiv.org Artificial Intelligence

2503.13493

Country:

North America > Mexico (0.35)
Atlantic Ocean > Gulf of Mexico (0.25)
North America > United States > Texas > Harris County > Houston (0.05)
(9 more...)

Genre: Research Report > New Finding (0.69)

Industry: Energy > Renewable > Wind (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.98)

Fine-Tuning Federated Learning-Based Intrusion Detection Systems for Transportation IoT

Akinie, Robert, Gyimah, Nana Kankam Brym, Bhavsar, Mansi, Kelly, John

arXiv.org Artificial Intelligence, Feb-9-2025

The rapid advancement of machine learning (ML) and on-device computing has revolutionized various industries, including transportation, through the development of Connected and Autonomous Vehicles (CAVs) and Intelligent Transportation Systems (ITS). These technologies improve traffic management and vehicle safety, but also introduce significant security and privacy concerns, such as cyberattacks and data breaches. Traditional Intrusion Detection Systems (IDS) are increasingly inadequate in detecting modern threats, leading to the adoption of ML-based IDS solutions. Federated Learning (FL) has emerged as a promising method for enabling the decentralized training of IDS models on distributed edge devices without sharing sensitive data. However, deploying FL-based IDS in CAV networks poses unique challenges, including limited computational and memory resources on edge devices, competing demands from critical applications such as navigation and safety systems, and the need to scale across diverse hardware and connectivity conditions. To address these issues, we propose a hybrid server-edge FL framework that offloads pre-training to a central server while enabling lightweight fine-tuning on edge devices. This approach reduces memory usage by up to 42%, decreases training times by up to 75%, and achieves competitive IDS accuracy of up to 99.2%. Scalability analyses further demonstrate minimal performance degradation as the number of clients increases, highlighting the framework's feasibility for CAV networks and other IoT applications.
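The server-side step of a federated round can be sketched with the widely used FedAvg rule, in which the server averages client parameters weighted by local sample counts. The abstract does not state which aggregation rule the framework uses, so this is an assumed stand-in; models are represented as flat lists of floats, and all names are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """FedAvg aggregation: sample-count-weighted mean of client parameters.

    client_weights: one flat list of floats per client, all the same length.
    client_sizes: number of local training samples held by each client.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Hypothetical fine-tuned parameters from two edge devices.
    updated = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
    print(updated)
```

In a hybrid setup like the one described, the clients would start from server-pretrained weights and send back only their fine-tuned parameters for this averaging step.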

artificial intelligence, edge device, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2502.06099

Country:

North America > United States > South Carolina (0.04)
North America > United States > North Carolina > Guilford County > Greensboro (0.04)
North America > United States > Minnesota > Blue Earth County > Mankato (0.04)

Genre: Research Report > Promising Solution (0.88)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Adaptive Object Detection for Indoor Navigation Assistance: A Performance Evaluation of Real-Time Algorithms

Pratap, Abhinav, Kumar, Sushant, Chakravarty, Suchinton

arXiv.org Artificial Intelligence, Jan-30-2025

This study addresses the critical need for accurate and efficient object detection in assistive technologies for visually impaired individuals. We systematically evaluate the performance of four prominent real-time object detection algorithms (YOLO, SSD, Faster R-CNN, and Mask R-CNN) within the context of indoor navigation assistance. Our analysis, conducted on the Indoor Objects Detection dataset, focuses on key parameters including detection accuracy, processing speed, and adaptability to the unique challenges of indoor environments. This research contributes to a deeper understanding of adaptive machine learning applications that can significantly improve indoor navigation solutions for the visually impaired, promoting inclusivity and accessibility. In today's technology-driven society, there is an increasing emphasis on enhancing accessibility for visually impaired individuals.

application, detection, faster r-cnn, (13 more...)

arXiv.org Artificial Intelligence

2501.18444

Country:

North America > United States > Minnesota > Blue Earth County > Mankato (0.04)
Asia > India > Tamil Nadu > Chennai (0.04)
Asia > India > Maharashtra (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
